A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion
Internal identifier: 000771 (Main/Exploration); previous: 000770; next: 000772
Authors: Othman Lachhab [Morocco]; Joseph Di Martino [France]; Elhassane Ibn Elhaj [Morocco]; Ahmed Hammouch [Morocco]
Source:
- SpringerPlus [ 2193-1801 ] ; 2015.
English descriptors
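- mix: Automatic speech recognition (ASR), Esophageal speech assessment, Pathological voices, Speech enhancement, Voice conversion.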
Abstract
In this paper, we propose a hybrid system based on a modified statistical GMM voice conversion algorithm for improving the recognition of esophageal speech. This hybrid system aims to compensate for the distorted information present in the esophageal acoustic features by using a voice conversion method. The esophageal speech is converted into a “target” laryngeal speech using an iterative statistical estimation of a transformation function. We did not apply a speech synthesizer for reconstructing the converted speech signal, given that the converted Mel cepstral vectors are used directly as input to our speech recognition system. Furthermore, the feature vectors are linearly transformed by the HLDA (heteroscedastic linear discriminant analysis) method to reduce their dimensionality, projecting them into a smaller space with good discriminative properties. The experimental results demonstrate that our proposed system improves phone recognition accuracy by an absolute 3.40 % compared with the accuracy obtained with neither HLDA nor voice conversion.
Url:
DOI: 10.1186/s40064-015-1428-2
PubMed: 26543778
PubMed Central: 4627987
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000020
- to stream Pmc, to step Curation: 000020
- to stream Pmc, to step Checkpoint: 000016
- to stream PubMed, to step Corpus: 000031
- to stream PubMed, to step Curation: 000031
- to stream PubMed, to step Checkpoint: 000061
- to stream Ncbi, to step Merge: 000213
- to stream Ncbi, to step Curation: 000206
- to stream Ncbi, to step Checkpoint: 000206
- to stream Hal, to step Corpus: 000828
- to stream Hal, to step Curation: 000828
- to stream Hal, to step Checkpoint: 000239
- to stream Main, to step Merge: 000760
- to stream Main, to step Curation: 000771
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion</title>
<author><name sortKey="Lachhab, Othman" sort="Lachhab, Othman" uniqKey="Lachhab O" first="Othman" last="Lachhab">Othman Lachhab</name>
<affiliation wicri:level="3"><nlm:aff id="Aff1">LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName><settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Di Martino, Joseph" sort="Di Martino, Joseph" uniqKey="Di Martino J" first="Joseph" last="Di Martino">Joseph Di Martino</name>
<affiliation wicri:level="3"><nlm:aff id="Aff2">LORIA, B.P. 239, Vandœuvre-lès-Nancy, 54506 France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA, B.P. 239, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName><region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
<settlement type="city" wicri:auto="agglo">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Elhaj, Elhassane Ibn" sort="Elhaj, Elhassane Ibn" uniqKey="Elhaj E" first="Elhassane Ibn" last="Elhaj">Elhassane Ibn Elhaj</name>
<affiliation wicri:level="3"><nlm:aff id="Aff3">INPT, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>INPT, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName><settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Hammouch, Ahmed" sort="Hammouch, Ahmed" uniqKey="Hammouch A" first="Ahmed" last="Hammouch">Ahmed Hammouch</name>
<affiliation wicri:level="3"><nlm:aff id="Aff1">LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName><settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">26543778</idno>
<idno type="pmc">4627987</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4627987</idno>
<idno type="RBID">PMC:4627987</idno>
<idno type="doi">10.1186/s40064-015-1428-2</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000020</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000020</idno>
<idno type="wicri:Area/Pmc/Curation">000020</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000020</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000016</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000016</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="wicri:Area/PubMed/Corpus">000031</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000031</idno>
<idno type="wicri:Area/PubMed/Curation">000031</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000031</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000061</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">000061</idno>
<idno type="wicri:Area/Ncbi/Merge">000213</idno>
<idno type="wicri:Area/Ncbi/Curation">000206</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000206</idno>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-01221503</idno>
<idno type="url">https://hal.inria.fr/hal-01221503</idno>
<idno type="wicri:Area/Hal/Corpus">000828</idno>
<idno type="wicri:Area/Hal/Curation">000828</idno>
<idno type="wicri:Area/Hal/Checkpoint">000239</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">000239</idno>
<idno type="wicri:doubleKey">2193-1801:2015:Lachhab O:a:preliminary:study</idno>
<idno type="wicri:Area/Main/Merge">000760</idno>
<idno type="wicri:Area/Main/Curation">000771</idno>
<idno type="wicri:Area/Main/Exploration">000771</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion</title>
<author><name sortKey="Lachhab, Othman" sort="Lachhab, Othman" uniqKey="Lachhab O" first="Othman" last="Lachhab">Othman Lachhab</name>
<affiliation wicri:level="3"><nlm:aff id="Aff1">LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName><settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Di Martino, Joseph" sort="Di Martino, Joseph" uniqKey="Di Martino J" first="Joseph" last="Di Martino">Joseph Di Martino</name>
<affiliation wicri:level="3"><nlm:aff id="Aff2">LORIA, B.P. 239, Vandœuvre-lès-Nancy, 54506 France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA, B.P. 239, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName><region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
<settlement type="city" wicri:auto="agglo">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Elhaj, Elhassane Ibn" sort="Elhaj, Elhassane Ibn" uniqKey="Elhaj E" first="Elhassane Ibn" last="Elhaj">Elhassane Ibn Elhaj</name>
<affiliation wicri:level="3"><nlm:aff id="Aff3">INPT, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>INPT, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName><settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Hammouch, Ahmed" sort="Hammouch, Ahmed" uniqKey="Hammouch A" first="Ahmed" last="Hammouch">Ahmed Hammouch</name>
<affiliation wicri:level="3"><nlm:aff id="Aff1">LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName><settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j">SpringerPlus</title>
<idno type="eISSN">2193-1801</idno>
<imprint><date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="mix" xml:lang="en"><term>Automatic speech recognition (ASR)</term>
<term>Esophageal speech assessment</term>
<term>Pathological voices</term>
<term>Speech enhancement</term>
<term>Voice conversion</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p>In this paper, we propose a hybrid system based on a modified statistical GMM voice conversion algorithm for improving the recognition of esophageal speech. This hybrid system aims to compensate for the distorted information present in the esophageal acoustic features by using a voice conversion method. The esophageal speech is converted into a “target” laryngeal speech using an iterative statistical estimation of a transformation function. We did not apply a speech synthesizer for reconstructing the converted speech signal, given that the converted Mel cepstral vectors are used directly as input of our speech recognition system. Furthermore the feature vectors are linearly transformed by the HLDA (heteroscedastic linear discriminant analysis) method to reduce their size in a smaller space having good discriminative properties. The experimental results demonstrate that our proposed system provides an improvement of the phone recognition accuracy with an absolute increase of 3.40 % when compared with the phone recognition accuracy obtained with neither HLDA nor voice conversion.</p>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Boll, Sf" uniqKey="Boll S">SF Boll</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Doi, D" uniqKey="Doi D">D Doi</name>
</author>
<author><name sortKey="Toda, T" uniqKey="Toda T">T Toda</name>
</author>
<author><name sortKey="Nakamura, K" uniqKey="Nakamura K">K Nakamura</name>
</author>
<author><name sortKey="Saruwatari, H" uniqKey="Saruwatari H">H Saruwatari</name>
</author>
<author><name sortKey="Shikano, K" uniqKey="Shikano K">K Shikano</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gales, Mjf" uniqKey="Gales M">MJF Gales</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Kanungo, T" uniqKey="Kanungo T">T Kanungo</name>
</author>
<author><name sortKey="Mount, D" uniqKey="Mount D">D Mount</name>
</author>
<author><name sortKey="Netanyahu, N" uniqKey="Netanyahu N">N Netanyahu</name>
</author>
<author><name sortKey="Piatko, C" uniqKey="Piatko C">C Piatko</name>
</author>
<author><name sortKey="Silverman, R" uniqKey="Silverman R">R Silverman</name>
</author>
<author><name sortKey="Wu, A" uniqKey="Wu A">A Wu</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kumar, N" uniqKey="Kumar N">N Kumar</name>
</author>
<author><name sortKey="Andreou, A" uniqKey="Andreou A">A Andreou</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Lee, Kf" uniqKey="Lee K">KF Lee</name>
</author>
<author><name sortKey="Hon, Hw" uniqKey="Hon H">HW Hon</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Liu, H" uniqKey="Liu H">H Liu</name>
</author>
<author><name sortKey="Zhao, Q" uniqKey="Zhao Q">Q Zhao</name>
</author>
<author><name sortKey="Wan, M" uniqKey="Wan M">M Wan</name>
</author>
<author><name sortKey="Wang, S" uniqKey="Wang S">S Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mantilla Caeiros, A" uniqKey="Mantilla Caeiros A">A Mantilla-Caeiros</name>
</author>
<author><name sortKey="Nakano Miyatake, M" uniqKey="Nakano Miyatake M">M Nakano-Miyatake</name>
</author>
<author><name sortKey="Perez Meana, H" uniqKey="Perez Meana H">H Perez-Meana</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Matui, K" uniqKey="Matui K">K Matui</name>
</author>
<author><name sortKey="Hara, N" uniqKey="Hara N">N Hara</name>
</author>
<author><name sortKey="Kobayashi, N" uniqKey="Kobayashi N">N Kobayashi</name>
</author>
<author><name sortKey="Hirose, H" uniqKey="Hirose H">H Hirose</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Pravena, D" uniqKey="Pravena D">D Pravena</name>
</author>
<author><name sortKey="Dhivya, S" uniqKey="Dhivya S">S Dhivya</name>
</author>
<author><name sortKey="Durga Devi, A" uniqKey="Durga Devi A">A Durga Devi</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Qi, Y" uniqKey="Qi Y">Y Qi</name>
</author>
<author><name sortKey="Weinberg, B" uniqKey="Weinberg B">B Weinberg</name>
</author>
<author><name sortKey="Bi, N" uniqKey="Bi N">N Bi</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rabiner, Lr" uniqKey="Rabiner L">LR Rabiner</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Sharifzadeh, Hr" uniqKey="Sharifzadeh H">HR Sharifzadeh</name>
</author>
<author><name sortKey="Mcloughlin, Iv" uniqKey="Mcloughlin I">IV McLoughlin</name>
</author>
<author><name sortKey="Ahmadi, F" uniqKey="Ahmadi F">F Ahmadi</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Stylianou, Y" uniqKey="Stylianou Y">Y Stylianou</name>
</author>
<author><name sortKey="Cappe, O" uniqKey="Cappe O">O Cappé</name>
</author>
<author><name sortKey="Moulines, E" uniqKey="Moulines E">E Moulines</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Tanaka, K" uniqKey="Tanaka K">K Tanaka</name>
</author>
<author><name sortKey="Toda, T" uniqKey="Toda T">T Toda</name>
</author>
<author><name sortKey="Neubig, G" uniqKey="Neubig G">G Neubig</name>
</author>
<author><name sortKey="Sakti, S" uniqKey="Sakti S">S Sakti</name>
</author>
<author><name sortKey="Nakamura, S" uniqKey="Nakamura S">S Nakamura</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Toda, T" uniqKey="Toda T">T Toda</name>
</author>
<author><name sortKey="Black, W" uniqKey="Black W">W Black</name>
</author>
<author><name sortKey="Tokuda, K" uniqKey="Tokuda K">K Tokuda</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Turkmen, H" uniqKey="Turkmen H">H Türkmen</name>
</author>
<author><name sortKey="Karsligil, M" uniqKey="Karsligil M">M Karsligil</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Wuyts, L" uniqKey="Wuyts L">L Wuyts</name>
</author>
<author><name sortKey="De Bodt, Ms" uniqKey="De Bodt M">MS De Bodt</name>
</author>
<author><name sortKey="Molenberghs, G" uniqKey="Molenberghs G">G Molenberghs</name>
</author>
<author><name sortKey="Remacle, M" uniqKey="Remacle M">M Remacle</name>
</author>
<author><name sortKey="Heylen, L" uniqKey="Heylen L">L Heylen</name>
</author>
<author><name sortKey="Millet, B" uniqKey="Millet B">B Millet</name>
</author>
<author><name sortKey="Van Lierde, K" uniqKey="Van Lierde K">K Van Lierde</name>
</author>
<author><name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
<author><name sortKey="Van De Heyning, Ph" uniqKey="Van De Heyning P">PH Van de Heyning</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Yu, P" uniqKey="Yu P">P Yu</name>
</author>
<author><name sortKey="Ouakine, M" uniqKey="Ouakine M">M Ouakine</name>
</author>
<author><name sortKey="Revis, J" uniqKey="Revis J">J Revis</name>
</author>
<author><name sortKey="Giovanni, A" uniqKey="Giovanni A">A Giovanni</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations><list><country><li>France</li>
<li>Maroc</li>
</country>
<region><li>Grand Est</li>
<li>Lorraine (région)</li>
<li>Rabat-Salé-Kénitra</li>
</region>
<settlement><li>Nancy</li>
<li>Rabat</li>
<li>Vandœuvre-lès-Nancy</li>
</settlement>
</list>
<tree><country name="Maroc"><region name="Rabat-Salé-Kénitra"><name sortKey="Lachhab, Othman" sort="Lachhab, Othman" uniqKey="Lachhab O" first="Othman" last="Lachhab">Othman Lachhab</name>
</region>
<name sortKey="Elhaj, Elhassane Ibn" sort="Elhaj, Elhassane Ibn" uniqKey="Elhaj E" first="Elhassane Ibn" last="Elhaj">Elhassane Ibn Elhaj</name>
<name sortKey="Hammouch, Ahmed" sort="Hammouch, Ahmed" uniqKey="Hammouch A" first="Ahmed" last="Hammouch">Ahmed Hammouch</name>
</country>
<country name="France"><region name="Grand Est"><name sortKey="Di Martino, Joseph" sort="Di Martino, Joseph" uniqKey="Di Martino J" first="Joseph" last="Di Martino">Joseph Di Martino</name>
</region>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000771 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000771 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Lorraine |area= InforLorV4 |flux= Main |étape= Exploration |type= RBID |clé= PMC:4627987 |texte= A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:26543778" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
  | NlmPubMed2Wicri -a InforLorV4
This area was generated with Dilib version V0.6.33.